24 research outputs found

    Evaluation of essential genes in correlation networks using measures of centrality

    Correlation networks are emerging as powerful tools for modeling relationships in high-throughput data such as gene expression. Other types of biological networks, such as protein-protein interaction networks, are popular targets of study in network theory, and previous analysis has revealed that network structures identified using graph theoretic techniques often relate to certain biological functions. Structures such as highly connected nodes and groups of nodes have been found to correspond to essential genes and protein complexes, respectively. The correlation network, which measures the level of co-variation of gene expression levels, shares some structural properties with other types of biological networks. We created several correlation networks using publicly available gene expression data, and identified critical groups of nodes using graph theoretic properties used previously in other biological network studies. We found that some measures of network centrality can reveal genes of impact such as essential genes, suggesting that the correlation network can prove to be a powerful tool for modeling gene expression data. In addition, our method highlights the biological impact of a set of high-centrality nodes identified by combined measures of centrality, validating the link between structure and function in the notoriously noisy correlation network.
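As a rough illustration of how a centrality measure can rank nodes, the sketch below computes normalized degree centrality on a toy undirected network and picks the top-ranked nodes. The abstract does not say which centrality measures were used, so degree centrality, the gene labels `g1`..`g5`, and the helper names `degree_centrality` and `top_nodes` are all assumptions made for illustration.

```python
def degree_centrality(edges):
    """Return degree centrality (degree / (n - 1)) for an undirected edge list."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n < 2:
        # With fewer than two nodes the normalization is undefined; report 0.0.
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_nodes(centrality, k):
    """Return the k highest-centrality nodes, breaking ties by node name."""
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:k]


# Hypothetical toy network; the edges are not taken from any real dataset.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
centrality = degree_centrality(edges)
hubs = top_nodes(centrality, 2)
```

In this toy network `g1` touches three of the four other nodes, so it ranks first. Combining several measures, as the abstract describes, would mean computing additional scores per node and intersecting or aggregating the resulting rankings.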

    Recognizing Patient Safety Importance through Instrument Validation on Physicians' Assessment of an EHR

    Patient safety and high-quality patient care are critical concerns for healthcare providers. The Institute of Medicine report suggests medical errors account for up to 98,000 patient deaths each year. Therefore, the US healthcare system is looking to information technology applications as one means of making patient care safer. This paper compares the psychometric properties of the Electronic Health Record Nurse Satisfaction instrument (based on the Health Information Technology Research-based Evaluation Framework) to our study that employed the same instrument but measured clinical physicians' opinions of an EHR to determine if the instrument could be used across domains of users. Our results showed that the factor analysis and the clustering of the sub-scale items were different. We propose a two-factorial instrument that identifies the following dimensions: System Features/Performance and Data Quality/Accuracy. Another important contribution of this study is that patient safety was identified as a more salient indicator for physicians.

    On the Comparison of State- and Transition-based Analysis of Biological Relevance in Gene Co-expression Networks

    Traditional correlation network analysis typically involves creating a network using gene expression data and then identifying biologically relevant clusters from that network by enrichment with Gene Ontology or pathway information. When one wants to examine these networks in a dynamic way, such as between controls versus treatment or over time, a snapshot approach is taken by comparing network structures at each time point. The biological relevance of these structures is then reported and compared. In this research, we examine the same snapshot networks but focus on the enrichment of changes in structure to determine if these results give any more insight into the mechanisms behind observed phenotypes. Our main hypothesis is that more information, particularly related to potential dynamic changes, can be obtained through transition-based analysis of biological networks. To test this hypothesis, we compare gene expression data from the mouse hippocampus at three different time points: young, middle-aged, and aged, and compare the traditional state-based approach to the dynamic transition-based enrichment approach. In this study we use a clustering approach (SPICi) designed specifically for clustering of large biological networks. The results of this study verify an inconsistency between traditional and dynamic structure identification approaches through biological enrichment. These results highlight an intriguing issue for those performing, critiquing, and using network-based approaches in their research: a black box or workflow type of approach typically used in network-based research can be supplemented with a transition-based approach to support movement from in silico to in vivo experimentation of target genes.

    The Development of Parallel Adaptive Sampling Algorithms for Analyzing Biological Networks

    The availability of biological data at massive scales continues to represent unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques and efficient parallel computational methods to implement them will be crucial in extracting useful knowledge from this raw unprocessed data, such as in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise from biological correlation networks, thereby making it possible to find biologically relevant clusters from the filtered network. We show how selecting the appropriate filter is crucial in maintaining the key structures from the original networks and uncovering new ones after removing noisy relationships. We also conduct one of the first comparisons of two important sensitivity criteria: the perturbation due to the vertex numbers of the network and the perturbation due to data distribution. We demonstrate that our chordal-graph based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.

    An intelligent data-centric approach toward identification of conserved motifs in protein sequences

    The continued integration of the computational and biological sciences has revolutionized genomic and proteomic studies. However, efficient collaboration between these fields requires the creation of shared standards. A common problem arises when biological input does not properly fit the expectations of the algorithm, which can result in misinterpretation of the output. This potential confounding of input/output is a drawback, especially for motif-finding software. Here we propose a method for improving output by selecting input based upon evolutionary distance, domain architecture, and known function. This method improved detection of both known and unknown motifs in two separate case studies. By standardizing input considerations, both biologists and bioinformaticians can better interpret and design the evolving sophistication of bioinformatic software.

    A Parallel Template for Implementing Filters for Biological Correlation Networks

    High-throughput biological experiments are critical for their role in systems biology: the ability to survey the state of cellular mechanisms on a broad scale opens possibilities for the scientific researcher to understand how multiple components come together, and what goes wrong in disease states. However, the data returned from these experiments is massive and heterogeneous, and requires intuitive and clever computational algorithms for analysis. The correlation network model has been proposed as a tool for modeling and analysis of this high-throughput data; structures within the model identified by graph theory have been found to represent key players in major cellular pathways. Previous work has found that network filtering using graph theoretic structural concepts can reduce noise and strengthen biological signals in these networks. However, the process of filtering biological networks using such filters is computationally intensive and the filtered networks remain large. In this research, we develop a parallel template for these network filters to improve runtime, and use this high-performance environment to show that parallelization does not affect network structure or the biological function of that structure.

    A noise reducing sampling approach for uncovering critical properties in large scale biological networks

    A correlation network is a graph-based representation of relationships among genes or gene products, such as proteins. The advent of high-throughput bioinformatics has resulted in the generation of volumes of data that require sophisticated in silico models, such as the correlation network, for in-depth analysis. Each node in our network represents the expression levels of one gene across multiple samples, and an edge connecting two nodes reflects the correlation level between the two corresponding genes according to the Pearson correlation coefficient. Biological networks made in this manner are generally found to adhere to a scale-free structural nature, that is, they are modular and adhere to a power-law degree distribution. Filtering these structures to remove noise and coincidental edges in the network is a necessity for network theorists because, when examining entire genomes at once, network size and complexity can act as a bottleneck for network manageability. Our previous work demonstrated that chordal graph based sampling of networks results in viable models. In this paper, we extend our research to investigate how different orderings affect the results of our sampling, and maintain the viability of the resulting network structures. Our results show that chordal graph based sampling not only conserves clusters that are present within the original networks, but by reducing noise can also help uncover additional functional clusters that were previously not obtainable from the original network.
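The network construction described in this abstract (one node per gene, edges weighted by the Pearson correlation of expression profiles) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the expression values, the gene labels, the absolute-correlation cutoff of 0.9, and the helper names `pearson` and `correlation_network` are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        # A constant profile has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / (sd_x * sd_y)


def correlation_network(expression, threshold=0.9):
    """Map each gene pair whose |r| meets the (assumed) threshold to its r value."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


# Hypothetical expression profiles: one list of sample values per gene.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 2.5],
}
network = correlation_network(expression, threshold=0.9)
```

Here `geneA`, `geneB`, and `geneC` are perfectly correlated or anti-correlated and end up pairwise connected, while `geneD` falls below the cutoff and is left out. Comparing all pairs is quadratic in the number of genes, which is one reason genome-scale networks become hard to manage without filtering.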

    A Novel Correlation Networks Approach for the Identification of Gene Targets

    Correlation networks are emerging as a powerful tool for modeling temporal mechanisms within the cell. They are particularly useful in examining coexpression within microarray data, and studies have determined that correlation networks follow a power law degree distribution and thus manifest properties such as the existence of “hub” nodes and semicliques that potentially correspond to critical cellular structures. Difficulty lies in filtering coincidental relationships from causative structures in these large, noise-heavy networks. As such, computational expenses and algorithm availability limit accurate comparison, making it difficult to identify changes between networks. In this vein, we present our work identifying temporal relationships from microarray data obtained from mice in three stages of life. We examine the characteristics of mouse networks, including correlation and node degree distributions. Further, we identify high degree nodes (“hubs”) within networks and define their essentiality. Finally, we associate Gene Ontology annotations to network structures to deduce relationships between structure and cellular functions.

    Identifying modular function via edge annotation in gene correlation networks using Gene Ontology search

    Correlation networks provide a powerful tool for analyzing large sets of biological information. This method of high-throughput data modeling has important implications in uncovering novel knowledge of cellular function. Previous studies on other types of network modeling (protein-protein interaction networks, metabolomes, etc.) have demonstrated the presence of relationships between network structures and organization of cellular function. Studies with correlation networks further confirm the existence of such a relationship between network structure and biological function. However, correlation networks are typically noisy and the identified network structures, such as clusters, must be further investigated to verify actual cellular function. This is traditionally done using Gene Ontology enrichment of the genes in that cluster. In this study a novel method to identify common cluster functions in correlation networks is proposed, which uses annotations of edges as opposed to the traditional annotation of nodes. The results obtained using the proposed method reveal functional relationships in clusters not visible by the traditional approach.

    On the design of advanced filters for biological networks using graph theoretic properties

    Network modeling of biological systems is a powerful tool for analysis of high-throughput datasets by computational systems biologists. Integration of networks to form a heterogeneous model requires that each network be as noise-free as possible while still containing relevant biological information. In earlier work, we have shown that the graph theoretic properties of gene correlation networks can be used to highlight and maintain important structures such as high degree nodes, clusters, and critical links between sparse network branches while reducing noise. In this paper, we propose the design of advanced network filters using structurally related graph theoretic properties. While spanning trees and chordal subgraphs provide filters with special advantages, we hypothesize that a hybrid subgraph sampling method will allow for the design of a more effective filter preserving key properties in biological networks. The proposed approach allows us to optimize a number of parameters associated with the filtering process, which in turn improves upon the identification of essential genes in mouse aging networks.